C++: Refactor of UncheckedLeapYearAfterModification#21292
C++: Refactor of UncheckedLeapYearAfterModification#21292bdrodes wants to merge 31 commits intogithub:mainfrom
Conversation
…cks (UncheckedLeapYearAfterYearModification). Switch to using 'postprocess' for unit tests.
…ion. Includes new logic for detecting leap year checks, new forms of leap year checks detected, and various heuristics to remove false postives. Move TimeConversionFunction into LeapYear.qll and refactored to separate conversion functions that are expected to be checked for failure from those that auto correct leap year dates if feb 29 is provided on a non-leap year. Increas the set of known TimeConversionFunctions.
…e negative remains.
…auto correct for leap year should be considered.
There was a problem hiding this comment.
Pull request overview
Refactors the UncheckedLeapYearAfterYearModification CodeQL query to significantly reduce false positives by introducing more precise flow/guard modeling and expanded heuristics around conversions and “ignorable” operations.
Changes:
- Reworked
UncheckedLeapYearAfterYearModification.qlinto a path-problem with new taint/dataflow-based modeling and multiple false-positive suppression heuristics. - Centralized/expanded time conversion API modeling in
LeapYear.qlland updated related query/tests/expected outputs accordingly. - Extended date/time type support (e.g.,
TIME_FIELDS) and added release change notes.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| cpp/ql/test/query-tests/Likely Bugs/Leap Year/UncheckedLeapYearAfterYearModification/test.cpp | Adds many new positive/negative/edge test scenarios for the refactored query. |
| cpp/ql/test/query-tests/Likely Bugs/Leap Year/UncheckedLeapYearAfterYearModification/UncheckedReturnValueForTimeFunctions.expected | Updates expected results to match new line numbers/coverage. |
| cpp/ql/test/query-tests/Likely Bugs/Leap Year/UncheckedLeapYearAfterYearModification/UncheckedLeapYearAfterYearModification.qlref | Switches to inline-expectations postprocessing. |
| cpp/ql/test/query-tests/Likely Bugs/Leap Year/UncheckedLeapYearAfterYearModification/UncheckedLeapYearAfterYearModification.expected | Updates expected results to reflect path-problem output and new test suite. |
| cpp/ql/src/Likely Bugs/Leap Year/UncheckedReturnValueForTimeFunctions.ql | Refines conversion-return checking to exclude auto-leap-year-correcting APIs. |
| cpp/ql/src/Likely Bugs/Leap Year/UncheckedLeapYearAfterYearModification.ql | Full refactor: new flow configurations, guard recognition, and suppression heuristics. |
| cpp/ql/src/Likely Bugs/Leap Year/LeapYear.qll | Moves/expands TimeConversionFunction modeling and adds auto-correcting classification. |
| cpp/ql/lib/semmle/code/cpp/commons/DateTime.qll | Expands recognized unpacked time/date types and adds field-access classes for TIME_FIELDS. |
| cpp/ql/lib/change-notes/2026-02-06-UncheckedLeapYearAfterModification_Refactor | Adds changelog entry documenting the refactor and alert reduction. |
cpp/ql/test/query-tests/Likely Bugs/Leap Year/UncheckedLeapYearAfterYearModification/test.cpp
Show resolved
Hide resolved
cpp/ql/test/query-tests/Likely Bugs/Leap Year/UncheckedLeapYearAfterYearModification/test.cpp
Outdated
Show resolved
Hide resolved
cpp/ql/test/query-tests/Likely Bugs/Leap Year/UncheckedLeapYearAfterYearModification/test.cpp
Outdated
Show resolved
Hide resolved
cpp/ql/src/Likely Bugs/Leap Year/UncheckedLeapYearAfterYearModification.ql
Outdated
Show resolved
Hide resolved
cpp/ql/src/Likely Bugs/Leap Year/UncheckedLeapYearAfterYearModification.ql
Outdated
Show resolved
Hide resolved
cpp/ql/test/query-tests/Likely Bugs/Leap Year/UncheckedLeapYearAfterYearModification/test.cpp
Outdated
Show resolved
Hide resolved
cpp/ql/src/Likely Bugs/Leap Year/UncheckedLeapYearAfterYearModification.ql
Fixed
Show fixed
Hide fixed
cpp/ql/src/Likely Bugs/Leap Year/UncheckedLeapYearAfterYearModification.ql
Fixed
Show fixed
Hide fixed
cpp/ql/src/Likely Bugs/Leap Year/UncheckedLeapYearAfterYearModification.ql
Fixed
Show fixed
Hide fixed
cpp/ql/src/Likely Bugs/Leap Year/UncheckedLeapYearAfterYearModification.ql
Fixed
Show fixed
Hide fixed
cpp/ql/src/Likely Bugs/Leap Year/UncheckedLeapYearAfterYearModification.ql
Fixed
Show fixed
Hide fixed
cpp/ql/src/Likely Bugs/Leap Year/UncheckedLeapYearAfterYearModification.ql
Fixed
Show fixed
Hide fixed
cpp/ql/src/Likely Bugs/Leap Year/UncheckedLeapYearAfterYearModification.ql
Fixed
Show fixed
Hide fixed
cpp/ql/src/Likely Bugs/Leap Year/UncheckedLeapYearAfterYearModification.ql
Fixed
Show fixed
Hide fixed
cpp/ql/src/Likely Bugs/Leap Year/UncheckedLeapYearAfterYearModification.ql
Fixed
Show fixed
Hide fixed
cpp/ql/src/Likely Bugs/Leap Year/UncheckedLeapYearAfterYearModification.ql
Fixed
Show fixed
Hide fixed
geoffw0
left a comment
There was a problem hiding this comment.
Initial review - lots of small comments, though little in particular I have strong feelings about. I'm happy to see this query get an overhaul, I'm very happy to see lots of test cases, though there is more query code than I'd like from a readability perspective. It would be nice to have some kind of diagram of how all the flows to fit together to aid understanding.
If you make whichever improvements you feel are most valuable, then I expect to approve this. I do also want to briefly check:
- performance
- real world (MRVA) result changes
- CI
cpp/ql/test/query-tests/Likely Bugs/Leap Year/UncheckedLeapYearAfterYearModification/test.cpp
Outdated
Show resolved
Hide resolved
cpp/ql/test/query-tests/Likely Bugs/Leap Year/UncheckedLeapYearAfterYearModification/test.cpp
Show resolved
Hide resolved
cpp/ql/test/query-tests/Likely Bugs/Leap Year/UncheckedLeapYearAfterYearModification/test.cpp
Show resolved
Hide resolved
cpp/ql/src/Likely Bugs/Leap Year/UncheckedLeapYearAfterYearModification.ql
Show resolved
Hide resolved
cpp/ql/src/Likely Bugs/Leap Year/UncheckedLeapYearAfterYearModification.ql
Outdated
Show resolved
Hide resolved
cpp/ql/src/Likely Bugs/Leap Year/UncheckedLeapYearAfterYearModification.ql
Show resolved
Hide resolved
cpp/ql/src/Likely Bugs/Leap Year/UncheckedLeapYearAfterYearModification.ql
Show resolved
Hide resolved
cpp/ql/src/Likely Bugs/Leap Year/UncheckedLeapYearAfterYearModification.ql
Outdated
Show resolved
Hide resolved
cpp/ql/lib/change-notes/2026-02-06-UncheckedLeapYearAfterModification_Refactor.md
Outdated
Show resolved
Hide resolved
…ification.ql Co-authored-by: Geoffrey White <40627776+geoffw0@users.noreply.github.com>
…ication_Refactor.md Co-authored-by: Geoffrey White <40627776+geoffw0@users.noreply.github.com>
…w steps for two similar flows into a common additional step predicate.
https://github.com/microsoft/codeql into UncheckedLeaprYearAfterModification_Refactor_Upstream
…ification.ql Co-authored-by: Geoffrey White <40627776+geoffw0@users.noreply.github.com>
…ification.ql Co-authored-by: Geoffrey White <40627776+geoffw0@users.noreply.github.com>
Co-authored-by: Geoffrey White <40627776+geoffw0@users.noreply.github.com>
Co-authored-by: Geoffrey White <40627776+geoffw0@users.noreply.github.com>
Co-authored-by: Geoffrey White <40627776+geoffw0@users.noreply.github.com>
https://github.com/microsoft/codeql into UncheckedLeaprYearAfterModification_Refactor_Upstream
I think I've addressed all the comments. Are you able to run the DCA run? I'm not sure I have access to do so anymore. |
|
I've started a DCA run (though I'm not confident I remember the configuration that will work for CPP - we will see). |
geoffw0
left a comment
There was a problem hiding this comment.
All of my comments have been addressed to my satisfaction. I believe so have the Copilot comments. 👍
cpp/ql/src/Likely Bugs/Leap Year/UncheckedLeapYearAfterYearModification.ql
Outdated
Show resolved
Hide resolved
cpp/ql/src/Likely Bugs/Leap Year/UncheckedLeapYearAfterYearModification.ql
Outdated
Show resolved
Hide resolved
|
On the DCA run we gain 5 results and lose 38. To be honest I'm surprised this query had as many as 38 results given how narrow its purpose is, but there does seem to be a lot of date management code around. The new result set LGTM apart from a couple of FPs involving "magic numbers" 80 and 20, used to convert between various "zero" dates. The lost results are a wide assortment including many of the FP cases your changes cover. So good job! There is an overall 2% analysis time regression, which appears to be likely caused by a slowdown in this query. It doesn't "explode" on any particular project, so there's no crisis here, but I plan to take a look at what the query's spending that time doing... I also did a MRVA-100 run with similar results (if a less extreme improvement). There are a couple of FPs where we don't spot that |
I had a look at the query locally and I didn't get a consistent signal that anything is really wrong performance-wise. I've started a second DCA run to compare. |
…ification.ql Co-authored-by: Geoffrey White <40627776+geoffw0@users.noreply.github.com>
…ification.ql Co-authored-by: Geoffrey White <40627776+geoffw0@users.noreply.github.com>
I could add that, but in the cases you are seeing, what does the magic number 80 correspond to? Happy to add new number filters but I like to document why it is removed, ideally if I can tie it to some specific conversion or some specific datatype. |
Co-authored-by: Geoffrey White <40627776+geoffw0@users.noreply.github.com>
It seems to be about converting between "years since 1980" and "years since 1900" as best I can tell. e.g. here. I don't know how common "years since 1980" is but I did see this on another database as well. The 2% analysis performance regression persisted in the second DCA run. It's not a total blocker but it does suggest something could be wrong - and might have a worse impact on databases we haven't tested. I will try and find time to have another look at what is causing this. |
1980/01/01 is the start of the epoch on e.g. DOS, similar to what 1970/01/01 is on Unix. See also https://en.wikipedia.org/wiki/System_time#Operating_systems. |
Thanks, great reference! |
cpp/ql/src/Likely Bugs/Leap Year/UncheckedLeapYearAfterYearModification.ql
Outdated
Show resolved
Hide resolved
cpp/ql/src/Likely Bugs/Leap Year/UncheckedLeapYearAfterYearModification.ql
Outdated
Show resolved
Hide resolved
…ification.ql Co-authored-by: Geoffrey White <40627776+geoffw0@users.noreply.github.com>
…ification.ql Co-authored-by: Geoffrey White <40627776+geoffw0@users.noreply.github.com>
Added to the filter set. |
cpp/ql/src/Likely Bugs/Leap Year/UncheckedLeapYearAfterYearModification.ql
Outdated
Show resolved
Hide resolved
Address more non-ascii characters
|
@jketema how do we feel about the overall 2% analysis slowdown that merging this change will cause? The query is taking longer locally, it's definitely not quick any more, but I've not seen it anywhere close to timeouts and I've no reason to believe it scales worse than any other dataflow query. I can't really isolate a cause apart from the main dataflow ( |
2% is well within the usual noise ratio for DCA. |
It's definitely a bit more than just noise, but I think we should just accept this, as it seems to resolve a whole bunch of FPs. |
This PR addresses excessive false positive alerting from
Likely Bugs/Leap Year/UncheckedLeapYearAfterYearModification.ql. In separate in-depth auditing, the number of alerts drops from 40,000 down to 2,000 with these changes, with a much higher rate of true positives, though still too high to be considered more than medium precision (~25% false positive rate remaining).This PR is a complete refactor of the original query to address the false positives observed on production code. Some of the lessons learned here could be extrapolated into the LeapYear.qll library, but leaving changes like this for future PRs.